Efficient Computation of Address Sequences in Data Parallel Programs Using Closed Forms for Basis Vectors

نویسندگان

Ashwath Thirumalai

J. Ramanujam

چکیده

Arrays are mapped to processors through a two-step process—alignment followed by distribution—in dataparallel languages such as High Performance Fortran. This process of mapping creates disjoint pieces of the array that are locally owned by each processor. An HPF compiler that generates code for array statements must compute the sequence of local memory addresses accessed by each processor and the sequence of sends and receives for a given processor to access non-local data. In this paper, we present an approach to the address sequence generation problem using the theory of integer lattices. The set of elements referenced can be generated by integer linear combinations of basis vectors. Unlike other work on this problem, we derive closed form expressions for the basis vectors as a function of the mapping of data. Using these basis vectors and exploiting the fact that there is a repeating pattern in the access sequence, we derive highly optimized code that generates the pattern at runtime. The code generated uses table-lookup of the pattern. Experimental results show that our approach is faster than other solutions to this problem.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF (2m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...

متن کامل

An Efficient Compile-Time Approach to Compute Address Sequences in Data Parallel Programs

Arrays are mapped to processors through a two step process—alignment followed by distribution—in languages such as High Performance Fortran. This process of mapping creates disjoint pieces of the array that are locally owned by each processor. An HPF compiler that generates code for array statements must compute the sequence of local memory addresses accessed by each processor and the sequence ...

متن کامل

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traıc and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), which existing solutions such as BSD Radix Tries, scale poorly when traıc in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...

متن کامل

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

متن کامل

Address Sequence Generation for Data-Parallel Programs Using Integer Lattices

In data-parallel languages such as High Performance Fortran and Fortran D, arrays are mapped to processors through a two step process involving alignment followed by distribution. A compiler that generates code for each processor has to compute the sequence of local memory addresses accessed by each processor and the sequence of sends and receives for a given processor to access non-local data....

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

J. Parallel Distrib. Comput.

دوره 38 شماره

صفحات -

تاریخ انتشار 1996

Efficient Computation of Address Sequences in Data Parallel Programs Using Closed Forms for Basis Vectors

نویسندگان

چکیده

منابع مشابه

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

An Efficient Compile-Time Approach to Compute Address Sequences in Data Parallel Programs

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

Address Sequence Generation for Data-Parallel Programs Using Integer Lattices

عنوان ژورنال:

اشتراک گذاری